Abstract

Using one of the key properties of copulas, namely that they remain invariant under an arbitrary monotonic
change of variable, we investigate the null hypothesis that the dependence between financial assets can be
modeled by the Gaussian copula. We find that most pairs of currencies and pairs of major stocks are
compatible with the Gaussian copula hypothesis, while this hypothesis can be rejected for the dependence
between pairs of commodities (metals). Notwithstanding the apparent qualification of the Gaussian copula
hypothesis for most of the currencies and the stocks, a non-Gaussian copula, such as the Student's
copula, cannot be rejected if it has sufficiently many "degrees of freedom". As a consequence, it may
be very dangerous to embrace blindly the Gaussian copula hypothesis, especially when the correlation
coefficient between the pair of assets is too high, as the tail dependence neglected by the Gaussian copula
can be as large as 0.6, i.e., three out of five extreme events which occur in unison are missed.

JEL Classification: C12, C15, F31, G19

1 Introduction



The determination of the dependence between assets underlies many financial activities, such as risk
assessment and portfolio management, as well as option pricing and hedging. Following [Markovitz (1959)], the
covariance and correlation matrices have, for a long time, been considered as the main tools for quantifying
the dependence between assets. But the dimension of risk captured by the correlation matrices is only
satisfying for elliptic distributions and for moderate risk amplitudes [Sornette et al. (2000a)]. In all other cases,
this measure of risk is severely incomplete and can lead to a very strong underestimation of the real incurred
risks [Embrechts et al. (1999)].



Although the unidimensional (marginal) distributions of asset returns are reasonably constrained by
empirical data and are more or less satisfactorily described by a power law with tail index ranging between 2 and
4 [De Vries (1994), Lux (1996), Pagan (1996), Guillaume et al. (1997), Gopikrishnan et al. (1998)] or by
stretched exponentials [Laherrere and Sornette (1998), Gourieroux and Jasiak (1999), Sornette et al. (2000a),
Sornette et al. (2000b)], no equivalent results have been obtained for multivariate distributions of asset
returns. Indeed, a brute force determination of multivariate distributions is unreliable due to the limited data
set (the curse of dimensionality), while the sole knowledge of marginals (one-point statistics) of each asset
is not sufficient to obtain information on the multivariate distribution of these assets, which involves all the
n-point statistics.

Some progress may be expected from the concept of copulas, recently proposed to be useful for financial
applications [Embrechts et al. (2001), Frees and Valdez (1998), Haas (1999), Klugman and Parsa (1999)].
This concept has the desirable property of decoupling the study of the marginal distribution of each asset
from the study of their collective behavior or dependence. Indeed, the dependence between assets is
entirely embedded in the copula, so that a copula allows for a simple description of the dependence structure
between assets independently of the marginals. For instance, assets can have power law marginals and a
Gaussian copula or alternatively Gaussian marginals and a non-Gaussian copula, and any possible
combination thereof. Therefore, the determination of the multivariate distribution of assets can be performed in two
steps: (i) an independent determination of the marginal distributions using standard techniques for
distributions of a single variable; (ii) a study of the nature of the copula characterizing completely the dependence
between the assets. This exact separation between the marginal distributions and the dependence is
potentially very useful for risk management or option pricing and sensitivity analysis since it allows for testing
several scenarios with different kinds of dependence between assets while the marginals can be set to their
well-calibrated empirical estimates. Such an approach has been used by [Embrechts et al. (2001)] to
provide various bounds for the Value-at-Risk of a portfolio made of dependent risks, and by [Rosenberg (1999)]
or [Cherubini and Luciano (2000)] to price and to analyse the pricing sensitivity of binary digital options or
options on the minimum of a basket of assets.
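The two-step construction can be illustrated with a minimal Python sketch (not taken from the paper): pairs are first drawn from a bivariate Gaussian copula, and arbitrary marginals are then imposed through their quantile functions. The correlation `rho = 0.5`, the Pareto tail index `alpha`, the exponential `rate` and all function names are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Normal distribution (mean 0, variance 1)


def sample_gaussian_copula(rho, size, rng):
    """Step (ii): draw `size` pairs (u1, u2) from the bivariate Gaussian copula."""
    pairs = []
    for _ in range(size):
        g1 = rng.gauss(0.0, 1.0)
        # Second Normal variable, with correlation rho to the first one.
        g2 = rho * g1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Mapping through the Normal cdf gives uniform marginals on (0, 1).
        pairs.append((STD_NORMAL.cdf(g1), STD_NORMAL.cdf(g2)))
    return pairs


def pareto_quantile(u, alpha=3.0):
    """Step (i), first marginal: power law with tail index alpha (assumed value)."""
    return (1.0 - u) ** (-1.0 / alpha)


def exponential_quantile(u, rate=1.0):
    """Step (i), second marginal: exponential distribution (assumed choice)."""
    return -math.log(1.0 - u) / rate


rng = random.Random(0)
uniforms = sample_gaussian_copula(rho=0.5, size=5000, rng=rng)
# Same copula, different marginals: only the quantile functions change.
sample = [(pareto_quantile(u1), exponential_quantile(u2)) for u1, u2 in uniforms]
```

Replacing `pareto_quantile` or `exponential_quantile` by any other quantile function changes the marginals while leaving the dependence structure untouched, which is the scenario-testing use described above.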

A fundamental limitation of the copula approach is that there is in principle an infinite number of possible
copulas [Genest and MacKay (1986), Genest (1987), Genest and Rivest (1993), Joe (1993), Nelsen (1998)]
and, up to now, no general empirical study has determined the classes of copulas that are acceptable for
financial problems. In general, the choice of a given copula is guided both by the empirical evidence and
by technical constraints, i.e., the number of parameters necessary to describe the copula, the possibility of
obtaining efficient estimators of these parameters and also the possibility offered by the chosen parameterization
to allow for tractable analytical calculation. It is indeed sometimes more advantageous to prefer a simpler
copula to one that fits the data better, provided that we can clearly quantify the effects of this substitution.

In this vein, the first goal of the present article is to show that, in most cases, the Gaussian copula can 
provide an approximation of the unknown true copula that is sufficiently good so that it cannot be rejected 
on a statistical basis. Our second goal is to draw the consequences of the parameterization involved in the 
Gaussian copula in terms of potential over/underestimation of the risks, in particular for large and extreme
events. 

The paper is organized as follows. 

In section 2, we first recall some important general definitions and theorems about copulas that will
be useful in the sequel. We then introduce the concept of tail dependence that will allow us to quantify the
probability that two extreme events might occur simultaneously. We define and describe the two copulas that
will be at the core of our study: the Gaussian copula and the Student's copula, and compare their properties,
particularly in the tails.

In section 3, we present our statistical testing procedure, which is applied to pairs of financial time series.
First of all, we determine a test statistic which leads us to compare the empirical distribution of the data
with a $\chi^2$-distribution using a bootstrap method. We also test the sensitivity of our procedure by applying it
to synthetic multivariate Student's time series. This allows us to determine the minimum statistical test value
needed to be able to distinguish between a Gaussian and a Student's copula, as a function of the number of
degrees of freedom and of the correlation strength.

Section 4 presents the empirical results obtained for the following assets, which are combined pairwise
in the test statistics:

• 6 currencies,

• 6 metals traded on the London Metal Exchange,

• 22 stocks chosen among the largest companies quoted on the New York Stock Exchange.

We show that the Gaussian copula hypothesis is very reasonable for most stocks and currencies, while it is
hardly compatible with the description of multivariate behavior for metals.

Section 5 summarizes our results and concludes. 



2 Generalities about copulas 

2.1 Definitions and important results about copulas 

This section does not pretend to provide a rigorous mathematical exposition of the concept of copula. We
only recall a few basic definitions and theorems that will be useful in the following (for more information
about the concept of copula, see for instance [Lindskog (1999), Nelsen (1998)]).



We first give the definition of a copula of $n$ random variables.

Definition 1 (Copula)

A function $C : [0,1]^n \to [0,1]$ is an $n$-copula if it enjoys the following properties:

• $\forall u \in [0,1]$, $C(1, \dots, 1, u, 1, \dots, 1) = u$,

• $\forall u_i \in [0,1]$, $C(u_1, \dots, u_n) = 0$ if at least one of the $u_i$ equals zero,

• $C$ is grounded and $n$-increasing, i.e., the $C$-volume of every box whose vertices lie in $[0,1]^n$ is
positive.

It is clear from this definition that a copula is nothing but a multivariate distribution with support in $[0,1]^n$
and with uniform marginals. The fact that such copulas can be very useful for representing multivariate
distributions with arbitrary marginals is seen from the following result.

Theorem 1 (Sklar's Theorem)

Given an $n$-dimensional distribution function $F$ with continuous marginal (cumulative) distributions
$F_1, \dots, F_n$, there exists a unique $n$-copula $C : [0,1]^n \to [0,1]$ such that:

$$F(x_1, \dots, x_n) = C(F_1(x_1), \dots, F_n(x_n)) . \qquad (1)$$

This theorem provides both a parameterization of multivariate distributions and a construction scheme
for copulas. Indeed, given a multivariate distribution $F$ with marginals $F_1, \dots, F_n$, the function

$$C(u_1, \dots, u_n) = F\left(F_1^{-1}(u_1), \dots, F_n^{-1}(u_n)\right) \qquad (2)$$

is automatically an $n$-copula. This copula is the copula of the multivariate distribution $F$. We will use
this method in the sequel to derive the expressions of standard copulas such as the Gaussian copula or the
Student's copula.

A very powerful property of copulas is their invariance under arbitrary strictly increasing mappings of
the random variables:

Theorem 2 (Invariance Theorem)

Consider $n$ continuous random variables $X_1, \dots, X_n$ with copula $C$. Then, if $g_1(X_1), \dots, g_n(X_n)$ are
strictly increasing on the ranges of $X_1, \dots, X_n$, the random variables $Y_1 = g_1(X_1), \dots, Y_n = g_n(X_n)$ have
exactly the same copula $C$.

It is this result that shows us that the full dependence between the n random variables is completely captured 
by the copula, independently of the shape of the marginal distributions. This result is at the basis of our 
statistical study presented in section 3. 
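A small numerical illustration of the invariance theorem, written as a hedged Python sketch: since the empirical copula of a sample depends on the data only through their ranks, it is enough to check that strictly increasing maps leave the ranks unchanged. The sample size, the correlation and the two maps (`exp` and a cubic) are arbitrary choices made for the example.

```python
import math
import random


def ranks(values):
    """Rank of each observation (1 = smallest); ties are assumed absent."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


rng = random.Random(1)
x1 = [rng.gauss(0.0, 1.0) for _ in range(500)]
x2 = [0.6 * a + 0.8 * rng.gauss(0.0, 1.0) for a in x1]

# Two strictly increasing transformations g1 and g2 (illustrative choices).
y1 = [math.exp(a) for a in x1]
y2 = [b ** 3 + 2.0 * b for b in x2]

# The ranks, hence the empirical copula, are identical before and after.
same_ranks = ranks(x1) == ranks(y1) and ranks(x2) == ranks(y2)
```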



2.2 Dependence between random variables 

The dependence between two time series is usually described by their correlation coefficient. This
measure is fully satisfactory only for elliptic distributions [Embrechts et al. (1999)], which are functions of a
quadratic form of the random variables, when one is interested in moderately sized events. However, an
important issue for risk management concerns the determination of the dependence of the distributions in the
tails. Practically, the question is whether it is more probable that large or extreme events occur
simultaneously or, on the contrary, more or less independently. This is referred to as the presence or absence of "tail
dependence".

The tail dependence is also an interesting concept in studying the contagion of crises between markets or
countries. These questions have recently been addressed by [Ang and Cheng (2001), Longin and Solnik (2001),
Starica (1999)] among several others. Large negative moves in a country or market are often found to imply
large negative moves in others.

Technically, we need to determine the probability that a random variable X is large, knowing that the 
random variable Y is large. 

Definition 2 (Tail dependence 1) 

Let $X$ and $Y$ be random variables with continuous marginals $F_X$ and $F_Y$. The (upper) tail dependence
coefficient of $X$ and $Y$ is, if it exists,

$$\lim_{u \to 1} \Pr\{X > F_X^{-1}(u) \mid Y > F_Y^{-1}(u)\} = \lambda \in [0, 1] . \qquad (3)$$

In words, given that $Y$ is very large (which occurs with probability $1 - u$), the probability that $X$ is very
large at the same probability level $u$ defines asymptotically the tail dependence coefficient $\lambda$.

It turns out that this tail dependence is a pure copula property which is independent of the marginals. Let $C$
be the copula of the variables $X$ and $Y$, then

Theorem 3

If the bivariate copula $C$ is such that

$$\lim_{u \to 1} \frac{\bar{C}(u,u)}{1 - u} = \lambda \qquad (4)$$

exists (where $\bar{C}(u,u) = 1 - 2u + C(u,u)$), then $C$ has an upper tail dependence coefficient $\lambda$.

If $\lambda > 0$, the copula presents tail dependence and large events tend to occur simultaneously, with the
probability $\lambda$. On the contrary, when $\lambda = 0$, the copula has no tail dependence in this sense and large events
appear to occur essentially independently. There is however a subtlety in this definition of tail dependence.
To make it clear, first consider the case where for large $X$ and $Y$ the distribution function $F(x, y)$ factorizes
such that

$$\lim_{x,y \to \infty} \frac{F(x,y)}{F_X(x) F_Y(y)} = 1 . \qquad (5)$$

This means that, for $X$ and $Y$ sufficiently large, these two variables can be considered as independent. It is
then easy to show that

$$\lim_{u \to 1} \Pr\{X > F_X^{-1}(u) \mid Y > F_Y^{-1}(u)\} = \lim_{u \to 1} 1 - F_X(F_X^{-1}(u)) \qquad (6)$$

$$= \lim_{u \to 1} 1 - u = 0, \qquad (7)$$

so that independent variables really have no tail dependence, as one can expect.

Unfortunately, the converse does not hold: a value $\lambda = 0$ does not automatically imply true
independence, namely that $F(x, y)$ satisfies equation (5). Indeed, the tail independence criterion $\lambda = 0$ may still be
associated with an absence of factorization of the multivariate distribution for large $X$ and $Y$. In a weaker
sense, there may still be a dependence in the tail even when $\lambda = 0$. Such behavior is for instance exhibited
by the Gaussian copula, which has zero tail dependence according to definition 2 but nevertheless does
not have a factorizable multivariate distribution, since the non-diagonal term of the quadratic form in the
exponential function does not become negligible in general as $X$ and $Y$ go to infinity. To summarize,
tail independence, according to definition 2, is not equivalent to independence in the tail as defined in
equation (5).

After this brief review of the main concepts underlying copulas, we now present two special families of 
copulas : the Gaussian copula and the Student's copula. 



2.3 The Gaussian copula 

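The Gaussian copula is the copula derived from the multivariate Gaussian distribution. Let $\Phi$ denote the
standard Normal (cumulative) distribution and $\Phi_{\rho,n}$ the $n$-dimensional Gaussian distribution with
correlation matrix $\rho$. Then, the Gaussian $n$-copula with correlation matrix $\rho$ is

$$C_\rho(u_1, \dots, u_n) = \Phi_{\rho,n}\left(\Phi^{-1}(u_1), \dots, \Phi^{-1}(u_n)\right) , \qquad (8)$$

whose density

$$c_\rho(u_1, \dots, u_n) = \frac{\partial^n C_\rho(u_1, \dots, u_n)}{\partial u_1 \cdots \partial u_n} \qquad (9)$$

reads

$$c_\rho(u_1, \dots, u_n) = \frac{1}{\sqrt{\det \rho}} \exp\left(-\frac{1}{2}\, y(u)^t \left(\rho^{-1} - \mathrm{Id}\right) y(u)\right) \qquad (10)$$

with $y_k(u) = \Phi^{-1}(u_k)$. Note that theorem 1 and equation (2) ensure that $C_\rho(u_1, \dots, u_n)$ in equation (8) is
a copula.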

As we said before, the Gaussian copula does not have a tail dependence:

$$\lim_{u \to 1} \frac{\bar{C}_\rho(u,u)}{1 - u} = 0, \quad \forall \rho \in (-1, 1). \qquad (11)$$

This result is derived for example in [Embrechts et al. (2001)]. But this does not mean that the Gaussian
copula goes to the independent (or product) copula $\Pi(u_1, u_2) = u_1 \cdot u_2$ when $(u_1, u_2)$ goes to one. Indeed,
consider a distribution $F(x,y)$ with Gaussian copula:

$$F(x,y) = C_\rho(F_X(x), F_Y(y)). \qquad (12)$$

Its density is

$$f(x,y) = c_\rho(F_X(x), F_Y(y)) \cdot f_X(x) \cdot f_Y(y), \qquad (13)$$

where $f_X$ and $f_Y$ are the densities of $X$ and $Y$. Thus,

$$\lim_{(x,y) \to \infty} \frac{f(x,y)}{f_X(x) \cdot f_Y(y)} = \lim_{(x,y) \to \infty} c_\rho(F_X(x), F_Y(y)), \qquad (14)$$

which should equal 1 if the variables $X$ and $Y$ were independent in the tail. Reasoning in the quantile space,
we set $x = F_X^{-1}(u)$ and $y = F_Y^{-1}(u)$, which yields

$$\lim_{u \to 1} \frac{f(x,y)}{f_X(x) \cdot f_Y(y)} = \lim_{u \to 1} c_\rho(u,u). \qquad (15)$$

Using equation (10), it is now easy to show that $c_\rho(u,u)$ goes to one when $u$ goes to one if and only
if $\rho = 0$, which is equivalent to $C_{\rho=0}(u_1, u_2) = \Pi(u_1, u_2)$ for every $(u_1, u_2)$. When $\rho > 0$, $c_\rho(u,u)$ goes to
infinity, while for $\rho$ negative, $c_\rho(u,u)$ goes to zero as $u \to 1$. Thus, the dependence structure described by
the Gaussian copula is very different from the dependence structure of the independent copula, except for
$\rho = 0$.
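The limiting behavior on the diagonal can be checked numerically. For $n = 2$ and $y_1 = y_2 = y = \Phi^{-1}(u)$, the density (10) reduces to $c_\rho(u,u) = (1-\rho^2)^{-1/2} \exp\left(\rho\, y^2 / (1+\rho)\right)$. The short Python sketch below evaluates this expression; the function name and the grid of values of $u$ and $\rho$ are illustrative assumptions.

```python
import math
from statistics import NormalDist


def gaussian_copula_density_diagonal(u, rho):
    """Density c_rho(u, u) of the bivariate Gaussian copula on the diagonal."""
    y = NormalDist().inv_cdf(u)  # y = Phi^{-1}(u)
    return math.exp(rho * y * y / (1.0 + rho)) / math.sqrt(1.0 - rho * rho)


levels = (0.9, 0.99, 0.999, 0.9999)
for rho in (-0.5, 0.0, 0.5):
    # Grows without bound for rho > 0, stays at 1 for rho = 0, decays for rho < 0.
    print(rho, [gaussian_copula_density_diagonal(u, rho) for u in levels])
```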

The Gaussian copula is completely determined by the knowledge of the correlation matrix $\rho$. The
parameters involved in the description of the Gaussian copula are very simple to estimate, as we shall see in the
following.

In our tests presented below, we focus on pairs of assets, i.e., on Gaussian copulas involving only two
random variables. Testing the Gaussian copula hypothesis for two random variables gives useful information
for a larger number of dependent variables constituting a large basket or portfolio. Indeed, let us assume
that each pair $(a, b)$, $(b, c)$ and $(c, a)$ has a Gaussian copula. Then, the triplet $(a, b, c)$ also has a Gaussian
copula. This result generalizes to an arbitrary number of random variables.



2.4 The Student's copula 

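The Student's copula is derived from the Student's multivariate distribution. Given a multivariate Student's
distribution $T_{\rho,\nu}$ with $\nu$ degrees of freedom and a correlation matrix $\rho$

$$T_{\rho,\nu}(x) = \frac{1}{\sqrt{\det \rho}} \, \frac{\Gamma\left(\frac{\nu+n}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right) (\pi\nu)^{n/2}} \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} \frac{dx}{\left(1 + \frac{x^t \rho^{-1} x}{\nu}\right)^{\frac{\nu+n}{2}}} , \qquad (16)$$

the corresponding Student's copula reads

$$C_{\rho,\nu}(u_1, \dots, u_n) = T_{\rho,\nu}\left(t_\nu^{-1}(u_1), \dots, t_\nu^{-1}(u_n)\right) , \qquad (17)$$

where $t_\nu$ is the univariate Student's distribution with $\nu$ degrees of freedom. The density of the Student's
copula is thus

$$c_{\rho,\nu}(u_1, \dots, u_n) = \frac{1}{\sqrt{\det \rho}} \, \frac{\Gamma\left(\frac{\nu+n}{2}\right) \left[\Gamma\left(\frac{\nu}{2}\right)\right]^{n-1}}{\left[\Gamma\left(\frac{\nu+1}{2}\right)\right]^{n}} \, \frac{\prod_{k=1}^{n} \left(1 + \frac{y_k^2}{\nu}\right)^{\frac{\nu+1}{2}}}{\left(1 + \frac{y^t \rho^{-1} y}{\nu}\right)^{\frac{\nu+n}{2}}} , \qquad (18)$$

where $y_k = t_\nu^{-1}(u_k)$.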

Since the Student's distribution tends to the normal distribution when $\nu$ goes to infinity, the Student's
copula tends to the Gaussian copula as $\nu \to +\infty$. In contrast to the Gaussian copula, the Student's copula
for $\nu$ finite presents a tail dependence given by:

$$\lambda_\nu(\rho) = \lim_{u \to 1} \frac{\bar{C}_{\rho,\nu}(u,u)}{1 - u} = 2\, \bar{t}_{\nu+1}\left(\frac{\sqrt{\nu+1}\, \sqrt{1-\rho}}{\sqrt{1+\rho}}\right) , \qquad (19)$$

where $\bar{t}_{\nu+1}$ is the complementary cumulative univariate Student's distribution with $\nu + 1$ degrees of freedom
(see [Embrechts et al. (2001)] for the proof). Figure 1 shows the upper tail dependence coefficient as a
function of the correlation coefficient $\rho$ for different values of the number $\nu$ of degrees of freedom. As
expected from the fact that the Student's copula becomes identical to the Gaussian copula for $\nu \to +\infty$ for
all $\rho \neq 1$, $\lambda_\nu(\rho)$ exhibits a regular decay to zero as $\nu$ increases. Moreover, for $\nu$ sufficiently large, the tail
dependence is significantly different from 0 only when the correlation coefficient is sufficiently close to 1.
This suggests that, for moderate values of the correlation coefficient, a Student's copula with a large number
of degrees of freedom may be difficult to distinguish from the Gaussian copula from a statistical point of
view. This statement will be made quantitative in the following.
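Equation (19) is easy to evaluate numerically. The Python sketch below does so with the standard library only, so the Student's survival function is obtained by a simple Simpson integration of the density; this integration scheme, the number of steps and the function names are assumptions of the example, not part of the paper.

```python
import math


def student_pdf(x, dof):
    """Density of the univariate Student's distribution with `dof` degrees of freedom."""
    log_norm = (math.lgamma(0.5 * (dof + 1.0)) - math.lgamma(0.5 * dof)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_norm - 0.5 * (dof + 1.0) * math.log1p(x * x / dof))


def student_survival(x, dof, steps=2000):
    """Pr{t > x} for x >= 0, from a Simpson integration of the density on [0, x]."""
    if x == 0.0:
        return 0.5
    h = x / steps  # `steps` must be even for Simpson's rule
    total = student_pdf(0.0, dof) + student_pdf(x, dof)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * student_pdf(i * h, dof)
    return 0.5 - total * h / 3.0


def tail_dependence(rho, nu):
    """Upper tail dependence coefficient of the Student's copula, equation (19)."""
    arg = math.sqrt(nu + 1.0) * math.sqrt(1.0 - rho) / math.sqrt(1.0 + rho)
    return 2.0 * student_survival(arg, nu + 1.0)


for nu in (3, 10, 20, 50):
    print(nu, [round(tail_dependence(rho, nu), 4) for rho in (0.0, 0.4, 0.8)])
```

Scanning `rho` and `nu` in this way gives the kind of curves described for Figure 1: the coefficient increases with the correlation and decays as the number of degrees of freedom grows.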

Figure 2 presents the same information in a different way by showing the maximum value of the
correlation coefficient $\rho$ as a function of $\nu$, below which the tail dependence $\lambda_\nu(\rho)$ of a Student's copula is
smaller than a given small value, here taken equal to 1%, 2.5%, 5% and 10%. The choice $\lambda_\nu(\rho) = 5\%$ for
instance corresponds to 1 event in 20 for which the pair of variables are asymptotically coupled. At the
95% probability level, values of $\lambda_\nu(\rho) < 5\%$ are indistinguishable from 0, which means that the Student's
copula can be approximated by a Gaussian copula.

The description of a Student's copula relies on two parameters: the correlation matrix $\rho$, as in the
Gaussian case, and in addition the number of degrees of freedom $\nu$. The estimation of the parameter $\nu$
is rather difficult and this has an important impact on the estimated value of the correlation matrix. As a
consequence, the Student's copula is more difficult to calibrate and use than the Gaussian copula.



3 Testing the Gaussian copula hypothesis 

In view of the central role that the Gaussian paradigm has played and still plays, in particular in finance,
it is natural to start with the simplest choice of dependence between different random variables, namely
the Gaussian copula. It is also a natural first step as the Gaussian copula imposes itself in an approach
which consists in (1) performing a nonlinear transformation on the random variables into Normal
random variables (for the marginals), which is always possible, and (2) invoking a maximum entropy principle
(which amounts to adding the least additional information in the Shannon sense) to construct the
multivariable distribution of these Gaussianized random variables [Sornette et al. (2000a), Sornette et al. (2000b),
Andersen and Sornette (2001)].

In the sequel, we will denote by $H_0$ the null hypothesis according to which the dependence between two
(or more) random variables $X$ and $Y$ can be described by the Gaussian copula.



3.1 Test Statistics 

We now derive the test statistic which will allow us to reject or not our null hypothesis $H_0$ and state the
following proposition:

Proposition 1

Assuming that the $N$-dimensional random vector $x = (x_1, \dots, x_N)$ with distribution function $F$ and
marginals $F_i$ satisfies the null hypothesis $H_0$, then the variable

$$z^2 = \sum_{i,j=1}^{N} \Phi^{-1}(F_i(x_i)) \, (\rho^{-1})_{ij} \, \Phi^{-1}(F_j(x_j)), \qquad (20)$$

where the matrix $\rho$ is

$$\rho_{ij} = \mathrm{Cov}\left[\Phi^{-1}(F_i(x_i)), \Phi^{-1}(F_j(x_j))\right], \qquad (21)$$

follows a $\chi^2$-distribution with $N$ degrees of freedom.

To prove the proposition above, first consider an $N$-dimensional random vector $x = (x_1, \dots, x_N)$.
Let us denote by $F$ its distribution function and by $F_i$ the marginal distribution of each $x_i$. Let us now assume
that the distribution function $F$ satisfies $H_0$, so that $F$ has a Gaussian copula with correlation matrix $\rho$ while
the $F_i$'s can be any distribution function. According to theorem 1, the distribution $F$ can be represented as:

$$F(x_1, \dots, x_N) = \Phi_{\rho,N}\left(\Phi^{-1}(F_1(x_1)), \dots, \Phi^{-1}(F_N(x_N))\right) . \qquad (22)$$

Let us now transform the $x_i$'s into Normal random variables $y_i$'s:

$$y_i = \Phi^{-1}(F_i(x_i)) . \qquad (23)$$

Since the mapping $\Phi^{-1}(F_i(\cdot))$ is obviously increasing, theorem 2 allows us to conclude that the copula of
the variables $y_i$'s is identical to the copula of the variables $x_i$'s. Therefore, the variables $y_i$'s have Normal
marginal distributions and a Gaussian copula with correlation matrix $\rho$. Thus, by definition, the multivariate
distribution of the $y_i$'s is the multivariate Gaussian distribution with correlation matrix $\rho$:

$$G(y) = \Phi_{\rho,N}\left(\Phi^{-1}(F_1(x_1)), \dots, \Phi^{-1}(F_N(x_N))\right) \qquad (24)$$

$$= \Phi_{\rho,N}(y_1, \dots, y_N), \qquad (25)$$

and $y$ is a Gaussian random vector. From equations (24)-(25), we obviously have

$$\rho_{ij} = \mathrm{Cov}\left[\Phi^{-1}(F_i(x_i)), \Phi^{-1}(F_j(x_j))\right]. \qquad (26)$$

Consider now the random variable

$$z^2 = y^t \rho^{-1} y = \sum_{i,j=1}^{N} y_i \, (\rho^{-1})_{ij} \, y_j , \qquad (27)$$

where $\cdot^t$ denotes the transpose operator. This variable has already been considered in [Sornette et al. (2000a)]
in preliminary statistical tests of the transformation (23). It is well-known that the variable $z^2$ follows a
$\chi^2$-distribution with $N$ degrees of freedom. Indeed, since $y$ is a Gaussian random vector with covariance matrix¹
$\rho$, it follows that the components of the vector

$$\tilde{y} = \rho^{-1/2} y , \qquad (28)$$

are independent Normal random variables. Here, $\rho^{-1/2}$ denotes the square root of the matrix $\rho^{-1}$, which can
be obtained by the Cholesky decomposition, for instance. Thus, the sum $\tilde{y}^t \tilde{y} = z^2$ is the sum of the squares
of $N$ independent Normal random variables, which follows a $\chi^2$-distribution with $N$ degrees of freedom.



3.2 Testing procedure 

The testing procedure used in the sequel is now described. We consider two financial series ($N = 2$)
of size $T$: $\{x_1(1), \dots, x_1(t), \dots, x_1(T)\}$ and $\{x_2(1), \dots, x_2(t), \dots, x_2(T)\}$. We assume that the vectors
$x(t) = (x_1(t), x_2(t))$, $t \in \{1, \dots, T\}$, are independent and identically distributed with distribution $F$, which
implies that the variables $x_1(t)$ (respectively $x_2(t)$), $t \in \{1, \dots, T\}$, are also independent and identically
distributed, with distributions $F_1$ (respectively $F_2$).



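The cumulative distribution $F_i$ of each variable $x_i$, which is estimated empirically, is given by

$$\hat{F}_i(x_i) = \frac{1}{T} \sum_{k=1}^{T} \mathbf{1}_{\{x_i(k) \le x_i\}} , \qquad (29)$$

where $\mathbf{1}_{\{\cdot\}}$ is the indicator function, which equals one if its argument is true and zero otherwise. We use
these estimated cumulative distributions to obtain the Gaussian variables $\hat{y}_i$ as:

$$\hat{y}_i(k) = \Phi^{-1}\left(\hat{F}_i(x_i(k))\right) , \quad k \in \{1, \dots, T\}. \qquad (30)$$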



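The sample covariance matrix $\hat{\rho}$ is estimated by the expression:

$$\hat{\rho} = \frac{1}{T} \sum_{i=1}^{T} \hat{y}(i) \cdot \hat{y}(i)^t , \qquad (31)$$

which allows us to calculate the variable

$$z^2(k) = \sum_{i,j=1}^{2} \hat{y}_i(k) \, (\hat{\rho}^{-1})_{ij} \, \hat{y}_j(k) , \qquad (32)$$

as defined in (27) for $k \in \{1, \dots, T\}$, which should be distributed according to a $\chi^2$-distribution if the
Gaussian copula hypothesis is correct.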
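The transformations (29)-(32) can be sketched in a few lines of Python for $N = 2$. This is only an illustration under stated assumptions: the ranks are divided by $T + 1$ instead of $T$ so that the largest observation is not mapped to $+\infty$ by $\Phi^{-1}$, and the function names and the synthetic test data are hypothetical.

```python
import math
import random
from statistics import NormalDist


def gaussianize(series):
    """Equations (29)-(30): map a series to Normal scores via its empirical cdf.

    Assumption: ranks are divided by T + 1 rather than T, which keeps the
    argument of the inverse Normal cdf strictly inside (0, 1).
    """
    T = len(series)
    order = sorted(range(T), key=series.__getitem__)
    normal = NormalDist()
    scores = [0.0] * T
    for rank, index in enumerate(order, start=1):
        scores[index] = normal.inv_cdf(rank / (T + 1))
    return scores


def z_squared(x1, x2):
    """Equations (31)-(32): return the T values z^2(k) and the covariance entries."""
    y1, y2 = gaussianize(x1), gaussianize(x2)
    T = len(y1)
    c11 = sum(a * a for a in y1) / T
    c22 = sum(b * b for b in y2) / T
    c12 = sum(a * b for a, b in zip(y1, y2)) / T
    det = c11 * c22 - c12 * c12
    # Quadratic form y^t rho^{-1} y written out explicitly for the 2 x 2 case.
    z2 = [(c22 * a * a - 2.0 * c12 * a * b + c11 * b * b) / det
          for a, b in zip(y1, y2)]
    return z2, (c11, c12, c22)


# Synthetic data with a Gaussian copula (correlation 0.5) and non-Gaussian marginals.
rng = random.Random(42)
g1 = [rng.gauss(0.0, 1.0) for _ in range(1000)]
g2 = [0.5 * a + math.sqrt(0.75) * rng.gauss(0.0, 1.0) for a in g1]
z2, cov = z_squared([math.exp(a) for a in g1], [b ** 3 for b in g2])
```

By construction the sample mean of `z2` equals 2, the mean of a $\chi^2$ variable with two degrees of freedom, so the test has to rely on the shape of the whole distribution rather than on its first moment.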

The usual way of comparing an empirical with a theoretical distribution is to measure the distance
between these two distributions and to perform the Kolmogorov test or the Anderson-Darling [Anderson and Darling (1952)]
test (for a better accuracy in the tails of the distribution). The Kolmogorov distance is the maximum local
distance along the quantile, which most often occurs in the bulk of the distribution, while the
Anderson-Darling distance puts the emphasis on the tails of the two distributions by a suitable normalization. We
propose to complement these two distances by two additional measures which are defined as averages of the
Kolmogorov distance and of the Anderson-Darling distance respectively:

$$\text{Kolmogorov:} \quad d_1 = \max_z \left|F_{z^2}(z^2) - F_{\chi^2}(z^2)\right| \qquad (33)$$

$$\text{average Kolmogorov:} \quad d_2 = \int \left|F_{z^2}(z^2) - F_{\chi^2}(z^2)\right| \, dF_{\chi^2}(z^2) \qquad (34)$$

$$\text{Anderson-Darling:} \quad d_3 = \max_z \frac{\left|F_{z^2}(z^2) - F_{\chi^2}(z^2)\right|}{\sqrt{F_{\chi^2}(z^2)\left[1 - F_{\chi^2}(z^2)\right]}} \qquad (35)$$

$$\text{average Anderson-Darling:} \quad d_4 = \int \frac{\left|F_{z^2}(z^2) - F_{\chi^2}(z^2)\right|}{\sqrt{F_{\chi^2}(z^2)\left[1 - F_{\chi^2}(z^2)\right]}} \, dF_{\chi^2}(z^2) \qquad (36)$$

¹ Up to now, the matrix $\rho$ was named correlation matrix. But in fact, since the variables $y_i$'s have unit variance, their correlation
matrix is also their covariance matrix.

The Kolmogorov distance $d_1$ and its average $d_2$ are more sensitive to the deviations occurring in the bulk of
the distributions. In contrast, the Anderson-Darling distance $d_3$ and its average $d_4$ are more accurate in the
tails of the distributions. We present our statistical tests for these four distances in order to be as complete
as possible with respect to the different sensitivity of the tests.
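As a hedged illustration of the four distances for $N = 2$, the Python sketch below uses the closed form $F_{\chi^2}(z) = 1 - e^{-z/2}$ of the $\chi^2$ cumulative distribution with two degrees of freedom. The maxima are taken over the sample points (on both sides of each jump of the empirical distribution), and the integrals are approximated by a midpoint rule in the variable $p = F_{\chi^2}(z^2)$; the grid size, the function names and the synthetic sample are assumptions of the example.

```python
import math
import random
from bisect import bisect_right


def chi2_cdf_2dof(z):
    """Cumulative chi^2 distribution with two degrees of freedom."""
    return 1.0 - math.exp(-0.5 * z)


def distances(z2_sample, grid=2000):
    """Return (d1, d2, d3, d4) between the empirical cdf of z^2 and the chi^2 cdf."""
    zs = sorted(z2_sample)
    T = len(zs)
    d1 = d3 = 0.0
    for i, z in enumerate(zs):
        f = chi2_cdf_2dof(z)
        # The empirical cdf jumps from i/T to (i+1)/T at the i-th sorted point.
        gap = max(abs((i + 1) / T - f), abs(i / T - f))
        d1 = max(d1, gap)
        if 0.0 < f < 1.0:
            d3 = max(d3, gap / math.sqrt(f * (1.0 - f)))
    d2 = d4 = 0.0
    for k in range(grid):
        p = (k + 0.5) / grid              # p = F_chi2(z), midpoint rule in p
        z = -2.0 * math.log(1.0 - p)      # inverse chi^2 cdf for two degrees of freedom
        gap = abs(bisect_right(zs, z) / T - p)
        d2 += gap / grid
        d4 += gap / math.sqrt(p * (1.0 - p)) / grid
    return d1, d2, d3, d4


rng = random.Random(3)
chi2_sample = [-2.0 * math.log(1.0 - rng.random()) for _ in range(2000)]
d1, d2, d3, d4 = distances(chi2_sample)
```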

The distances $d_2$ and $d_4$ are not of common use in statistics, so let us justify our choice. One usually uses
distances similar to $d_2$ and $d_4$ but which differ by the square instead of the modulus of $F_{z^2}(z^2) - F_{\chi^2}(z^2)$ and
lead respectively to the $\omega$-test and the $\Omega$-test, whose statistics are theoretically known. The main advantage
of the distances $d_2$ and $d_4$ with respect to the more usual distances $\omega$ and $\Omega$ is that they are simply
the averages of the local distances whose maxima define $d_1$ and $d_3$. This averaging is very interesting and provides important information. Indeed,
the distances $d_1$ and $d_3$ are mainly controlled by the point that maximizes the argument within the $\max(\cdot)$
function. They are thus sensitive to the presence of an outlier. By averaging, $d_2$ and $d_4$ become less sensitive
to outliers, since the weight of such points is only of order $1/T$ (where $T$ is the size of the sample) while
it equals one for $d_1$ and $d_3$. Of course, the distances $\omega$ and $\Omega$ also perform a smoothing since they are
averaged quantities too. But they are averages of the square of these local distances, which leads to an undesired
overweighting of the largest events. In fact, this weight function is chosen as a convenient analytical form
that allows one to derive explicitly the theoretical asymptotic statistics for the $\omega$ and $\Omega$-tests. In contrast,
using the modulus of $F_{z^2}(z^2) - F_{\chi^2}(z^2)$ instead of its square in the expression of $d_2$ and $d_4$, no theoretical
test statistics can be derived analytically. In other words, the presence of the square instead of the modulus of
$F_{z^2}(z^2) - F_{\chi^2}(z^2)$ in the definition of the distances $\omega$ and $\Omega$ is motivated by mathematical convenience rather
than by statistical pertinence. In sum, the sole advantage of the standard distances $\omega$ and $\Omega$ with respect to
the distances $d_2$ and $d_4$ introduced here is the theoretical knowledge of their distributions. However, this
advantage disappears in our present case in which the covariance matrix is not known a priori and needs to
be estimated from the empirical data: indeed, the exact knowledge of all the parameters is necessary in the
derivation of the theoretical statistics of the $\omega$ and $\Omega$-tests (as well as the Kolmogorov test). Therefore, we
cannot directly use the results of these standard statistical tests. As a remedy, we propose a bootstrap method
[Efron and Tibshirani (1986)], whose accuracy is proved by [Chen and Lo (1997)] to be at least as good as
that given by asymptotic methods used to derive the theoretical distributions. For the present work, we have
determined that the generation of 10,000 synthetic time series was sufficient to obtain a good approximation
of the distribution of distances described above. Since a bootstrap method is needed to determine the test
statistics in every case, it is convenient to choose functional forms different from the usual ones in the $\omega$ and
$\Omega$-tests as they provide an improvement with respect to statistical reliability, as obtained with the $d_2$ and $d_4$
distances introduced here.

To summarize, our test procedure is as follows. 

1. Given the original time series $x(t)$, $t \in \{1, \dots, T\}$, we generate the Gaussian variables $\hat{y}(t)$,
$t \in \{1, \dots, T\}$.

2. We then estimate the covariance matrix $\hat{\rho}$ of the Gaussian variables $\hat{y}$, which allows us to compute the
variables $z^2$ and then measure the distance of its estimated distribution to the $\chi^2$-distribution.

3. Given this covariance matrix $\hat{\rho}$, we generate numerically a time series of $T$ Gaussian random vectors
with the same covariance matrix $\hat{\rho}$.

4. For the time series of Gaussian vectors synthetically generated with covariance matrix $\hat{\rho}$, we estimate
its sample covariance matrix $\tilde{\rho}$.

5. To each of the $T$ vectors of the synthetic Gaussian time series, we associate the corresponding
realization of the random variable $z^2$, called $\tilde{z}^2(t)$.

6. We can then construct the empirical distribution for the variable $\tilde{z}^2$ and measure the distance between
this distribution and the $\chi^2$-distribution.

7. Repeating 10,000 times the steps 3 to 6, we obtain an accurate estimate of the cumulative
distribution of distances between the distribution of the synthetic Gaussian variables and the theoretical
$\chi^2$-distribution.

8. Then, the distance obtained at step 2 for the true variables can be transformed into a significance level
by reading the value of this synthetically determined distribution of distances between the distribution
of the synthetic Gaussian variables and the theoretical $\chi^2$-distribution as a function of the distance:
this provides the probability to observe a distance smaller than the chosen or empirically determined
distance.
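Steps 2 to 8 can be sketched as follows in Python for $N = 2$. This is a minimal illustration and not the authors' code: only the Kolmogorov distance $d_1$ is used (any of the other distances could be substituted), the input is assumed to be already Gaussianized as in step 1, and the function names, the default seed handling and the small `n_boot` used in the demonstration are assumptions.

```python
import math
import random


def sample_covariance(y1, y2):
    """Entries (c11, c12, c22) of the 2 x 2 sample covariance matrix."""
    T = len(y1)
    return (sum(a * a for a in y1) / T,
            sum(a * b for a, b in zip(y1, y2)) / T,
            sum(b * b for b in y2) / T)


def kolmogorov_distance(y1, y2):
    """Estimate the covariance, form z^2 and return its distance d1 to chi^2(2)."""
    c11, c12, c22 = sample_covariance(y1, y2)
    det = c11 * c22 - c12 * c12
    z2 = sorted((c22 * a * a - 2.0 * c12 * a * b + c11 * b * b) / det
                for a, b in zip(y1, y2))
    T = len(z2)
    d1 = 0.0
    for i, z in enumerate(z2):
        f = 1.0 - math.exp(-0.5 * z)  # chi^2 cdf with two degrees of freedom
        d1 = max(d1, abs((i + 1) / T - f), abs(i / T - f))
    return d1


def bootstrap_significance(y1, y2, n_boot=10000, rng=None):
    """Return the observed distance and the fraction of synthetic distances below it."""
    rng = rng or random.Random()
    T = len(y1)
    observed = kolmogorov_distance(y1, y2)            # step 2
    c11, c12, c22 = sample_covariance(y1, y2)
    # Cholesky factor of the covariance matrix, used to correlate the draws.
    l11 = math.sqrt(c11)
    l21 = c12 / l11
    l22 = math.sqrt(c22 - l21 * l21)
    smaller = 0
    for _ in range(n_boot):                           # step 7
        s1, s2 = [], []
        for _ in range(T):                            # step 3
            e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            s1.append(l11 * e1)
            s2.append(l21 * e1 + l22 * e2)
        if kolmogorov_distance(s1, s2) < observed:    # steps 4 to 6
            smaller += 1
    return observed, smaller / n_boot                 # step 8


rng = random.Random(7)
g1 = [rng.gauss(0.0, 1.0) for _ in range(250)]
g2 = [0.4 * a + math.sqrt(0.84) * rng.gauss(0.0, 1.0) for a in g1]
d_gauss, p_gauss = bootstrap_significance(g1, g2, n_boot=200, rng=rng)
```

A value of the returned fraction close to one means that almost no synthetic Gaussian sample lies as far from the $\chi^2$-distribution as the data, which is the situation in which the Gaussian copula hypothesis would be rejected.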



3.3 Sensitivity of the method 

Before presenting the statistical tests, it is important to investigate the sensitivity of our testing procedure.
More precisely, can we distinguish for instance between a Gaussian copula and a Student's copula with a
large number of degrees of freedom, for a given value of the correlation coefficient? Formally, denoting by
$H_\nu$ the hypothesis according to which the true copula of the data is the Student's copula with $\nu$ degrees of
freedom, we want to determine the minimum significance level allowing us to distinguish between $H_0$ and
$H_\nu$.


3.3.1 Importance of the distinction between Gaussian and Student's copulas 



This question has important practical implications because, as discussed in section 2.4, the Student's copula
presents a significant tail dependence while the Gaussian copula has no asymptotic tail dependence.
Therefore, if our tests are unable to distinguish between a Student's and a Gaussian copula, we may be led to
choose the latter for the sake of simplicity and parsimony and, as a consequence, we may underestimate
severely the dependence between extreme events if the correct description turns out to be the Student's
copula. This may have catastrophic consequences in risk assessment and portfolio management.

Figure 1 provides a quantification of the dangers incurred by mistaking a Student's copula for a Gaussian
one. Consider the case of a Student's copula with $\nu = 20$ degrees of freedom and a correlation coefficient
$\rho$ lower than 0.3-0.4; its tail dependence $\lambda_\nu(\rho)$ turns out to be less than 0.7%, i.e., the
probability that one variable becomes extreme knowing that the other one is extreme is less than 0.7%. In this
case, the Gaussian copula with zero probability of simultaneous extreme events is not a bad approximation of
the Student's copula. In contrast, let us take a correlation $\rho$ larger than 0.7-0.8, for which the tail
dependence becomes larger than 10%, corresponding to a non-negligible probability of simultaneous extreme
events. The effect of tail dependence becomes of course much stronger as the number $\nu$ of degrees of
freedom decreases.
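The orders of magnitude quoted above can be checked numerically. The sketch below assumes the standard closed form of the tail dependence coefficient of the Student's copula, $\lambda_\nu(\rho) = 2\,\bar t_{\nu+1}\big(\sqrt{\nu+1}\,\sqrt{1-\rho}/\sqrt{1+\rho}\big)$, where $\bar t_{\nu+1}$ is the complementary cumulative distribution of a Student's variable with $\nu+1$ degrees of freedom; we assume this is the expression referred to in section 2.4. The function names are hypothetical, and the survival function is obtained by a plain Simpson integration of the Student's density so that only the standard library is needed.

```python
import math

def student_t_sf(x, dof, n_steps=2000):
    """P(T > x) for a Student t variable with `dof` degrees of freedom, for x >= 0.

    Computed as 0.5 minus the Simpson integral of the density over [0, x].
    """
    norm = math.gamma((dof + 1) / 2.0) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2.0))

    def pdf(t):
        return norm * (1.0 + t * t / dof) ** (-(dof + 1) / 2.0)

    h = x / n_steps  # n_steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(x)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * pdf(i * h)
    return 0.5 - total * h / 3.0

def tail_dependence(dof, rho):
    """Tail dependence coefficient of a Student's copula (assumed closed form, see lead-in)."""
    arg = math.sqrt(dof + 1.0) * math.sqrt((1.0 - rho) / (1.0 + rho))
    return 2.0 * student_t_sf(arg, dof + 1.0)

if __name__ == "__main__":
    for rho in (0.3, 0.4, 0.7, 0.8):
        print(f"nu = 20, rho = {rho}: lambda = {tail_dependence(20, rho):.4f}")
```

Under this assumption, the coefficient stays below one percent for $\nu = 20$ and moderate correlations, and exceeds ten percent once the correlation approaches 0.8, in line with the discussion above.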

These examples stress the importance of knowing whether our testing procedure allows us to distinguish
between a Student's copula with $\nu = 20$ (or fewer) degrees of freedom and a given correlation coefficient,
$\rho = 0.5$ for instance, and a Gaussian copula with an appropriate correlation coefficient $\rho'$.



3.3.2 Statistical test on the distinction between Gaussian and Student's copulas 

To address this question, we have generated 1,000 pairs of time series of size $T = 1250$, each pair of random
variables following a Student's bivariate distribution with $\nu$ degrees of freedom and a correlation
coefficient $\rho$ between the two simultaneous variables of the same pair, while the variables along the time
axis are all independent. We have then applied the previous testing procedure to each of the pairs of time series.
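One standard way to generate such Student's pairs, shown here as a hedged sketch rather than as the authors' procedure, is to divide a correlated Gaussian pair by the square root of an independent $\chi^2_\nu$ variable scaled by $\nu$. The function name `student_pairs` is hypothetical; the $\chi^2_\nu$ draw uses the fact that it is a Gamma variable with shape $\nu/2$ and scale 2.

```python
import math
import random

def student_pairs(dof, rho, size, rng):
    """Draw `size` independent pairs from a bivariate Student distribution.

    Each pair is a correlated Gaussian pair divided by a common random scale,
    which is what produces the joint (tail-dependent) Student structure.
    """
    pairs = []
    for _ in range(size):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rho * g1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # chi-square with `dof` degrees of freedom = Gamma(shape=dof/2, scale=2)
        scale = math.sqrt(rng.gammavariate(dof / 2.0, 2.0) / dof)
        pairs.append((g1 / scale, g2 / scale))
    return pairs

if __name__ == "__main__":
    rng = random.Random(0)
    # 1,000 pairs of series of length T = 1250, as in the text (nu = 4, rho = 0.5 chosen as an example)
    series = [student_pairs(4, 0.5, 1250, rng) for _ in range(1000)]
    print(len(series), len(series[0]))
```

Sharing the same scale between the two components is the essential design choice: drawing a separate scale for each component would give Student's marginals but not the Student's copula.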

Specifically, for each pair of time series, we construct the marginal distributions and transform the
Student's variables $x_i(k)$ into their Gaussian counterparts $y_i(k)$ via the transformation introduced above.
For each pair $(y_1(k), y_2(k))$, $k \in \{1, \dots, T\}$, we estimate its correlation matrix, then construct
the time series with $T$ realizations of the random variable $z^2(k)$ defined above. The set of $T$ variables
$z^2$ then allows us to construct the distribution of $z^2$ (with $N = 2$) and to compare it with the
$\chi^2$-distribution with two degrees of freedom. We then measure the distances $d_1$, $d_2$, $d_3$ and $d_4$
defined by (33)-(36) between the distribution of $z^2$ and the $\chi^2$-distribution. Using the 1,000 pairs of
such time series with the same $\nu$ and $\rho$, we then construct the distribution $D_i(d_i)$,
$i \in \{1, 2, 3, 4\}$, of each of these distances $d_i$. Using the previously determined distribution of
distances expected for the synthetic Gaussian variables, we can translate each distance $d$ obtained for the
Student's vectors into a corresponding Gaussian probability $p$: $p$ is the probability that pairs of Gaussian
random variables with the correlation coefficient $\rho$ have a distance equal to or larger than the distance
obtained for the Student's vector time series. A small $p$ corresponds to a clear distinction between
Student's and Gaussian vectors, as it is improbable that Gaussian vectors exhibit a distance larger than that
found for the Student's vectors. The "distribution of probabilities" $D(p) = D(p(d))$ then assesses how
often this "improbable" event occurs among the set of 1,000 Student's vectors, i.e., attempts to quantify the
rarity of such large deviations. In other words, the "distribution of probabilities" $D(p)$ gives the number of
Student's vectors that exhibit the value $p$ for the probability that Gaussian vectors can have a similar or
larger distance. Then, fixing a confidence level $D^*$, this procedure allows us to reject or not the null
hypothesis that the empirical vector of returns is described by a Gaussian copula: this will occur when the
observed $p$ gives a "distribution of probabilities" $D(p)$ larger than $D^*$.

The "distributions of probabilities" $D(p)$ for each of the four distances $d_i$, $i \in \{1, 2, 3, 4\}$, are
shown in figure 3 for $\nu = 4$ degrees of freedom and in figure 4 for $\nu = 20$ degrees of freedom, for 5
different values of the correlation coefficient $\rho$ = 0.1, 0.3, 0.5, 0.7 and 0.9. The very steep increase
observed for almost all cases in figure 3 reflects the fact that most of the 1,000 Student's vectors with
$\nu = 4$ degrees of freedom have a small $p$, i.e., their copula is easily distinguishable from the Gaussian
copula. The same cannot be stated for Student's vectors with $\nu = 20$ degrees of freedom. Note also that the
distances $d_1$, $d_2$ and $d_4$ give essentially the same result while the Anderson-Darling distance $d_3$ is
more sensitive to $\rho$, especially for small $\nu$.

Fixing for instance the confidence level at $D^* = 95\%$, we can read from each of these curves in figures
3 and 4 the minimum $p_{95\%}$-value necessary to distinguish a Student's copula with a given $\nu$ from a
Gaussian copula. This $p_{95\%}$ is the abscissa corresponding to the ordinate $D(p_{95\%}) = 0.95$. These
values $p_{95\%}$ are reported in table 1, for different values of the number $\nu$ of degrees of freedom
ranging from $\nu = 3$ to $\nu = 50$ and correlation coefficients $\rho$ = 0.1 to 0.9. The values of
$p_{95\%}(\nu, \rho)$ reported in table 1 are the maximum values that the probability $p$ should take in order
to be able to reject the hypothesis that a Student's copula with $\nu$ degrees of freedom and correlation
$\rho$ can be mistaken for a Gaussian copula at the 95% confidence level.
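Reading the abscissa at which the cumulative "distribution of probabilities" reaches 0.95 amounts to taking an empirical quantile of the simulated probabilities. A minimal sketch follows, assuming the probabilities obtained for the simulated Student's pairs are available as a plain list; both function names are hypothetical.

```python
import math

def cumulative_fraction(p_values, p):
    """Empirical cumulative D(p): fraction of simulated Student's pairs with probability <= p."""
    return sum(1 for q in p_values if q <= p) / len(p_values)

def p_at_level(p_values, level=0.95):
    """Smallest simulated p whose cumulative fraction D(p) reaches `level`."""
    ordered = sorted(p_values)
    index = max(0, math.ceil(level * len(ordered)) - 1)
    return ordered[index]
```

With 1,000 simulated pairs, `p_at_level` returns the 950th smallest probability, which plays the role of the value read off the curves at the ordinate 0.95.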

The results of table 1 are depicted in figures 5 and 6 and represent the conventional "power/size"
statistics. The statistical "power" is usually defined as the probability of rejecting the null hypothesis
when it is false. When the null hypothesis $H_0$ and the alternative hypothesis $H_\nu$ are identical, the
power should be equal to 0.05, corresponding to the 95% confidence level. In our framework, this amounts to
plotting in the abscissa the inverse $\nu^{-1}$ of the number $\nu$ of degrees of freedom, which provides a
natural "distance" between the Gaussian copula hypothesis $H_0$ and the Student's copula hypothesis $H_\nu$.
In the ordinate, the "power" is represented by the minimum significance level $(1 - p_{95\%})$ necessary to
distinguish between $H_0$ and $H_\nu$. The typical shape of these curves is a sigmoid, starting from a very
small value for $\nu^{-1} \to 0$, increasing as $\nu^{-1}$ increases and going to 1 as $\nu^{-1}$ becomes large
enough. This typical shape simply expresses the fact that it is easy to separate a Gaussian copula from a
Student's copula with a small number of degrees of freedom, while it is difficult and even impossible for too
large a number of degrees of freedom.

Figure 5 shows us that the distances $d_1$, $d_2$ and $d_4$ are not sensitive to the value of the correlation
coefficient $\rho$, while the discriminating power of $d_3$ increases with $\rho$. In figure 6, we note that
$d_2$ and $d_4$ have the same discriminating power for all values of $\rho$ (which makes them somewhat
redundant) and that they are the most efficient at differentiating $H_0$ from $H_\nu$ for small $\rho$. When
$\rho$ is about 0.5, $d_2$, $d_3$ and $d_4$ (and maybe $d_1$) are equivalent with respect to their
discriminating power, while for large $\rho$, $d_3$ becomes the most discriminating one with high significance.

This study of the test sensitivity involves a non-parametric approach and the question may arise why it
should be preferred to a direct parametric test involving, for instance, the calibration of the Student copula.
First, a parametric test of copulas would face the "curse of dimensionality", i.e., the estimation of functions 
of several variables. With the limited data set available, this does not seem a reasonable approach. Second, 
we have taken the Student copula as an example of an alternative to the Gaussian copula. However, our tests 
are independent of this choice and aim mainly at testing the rejection of the Gaussian copula hypothesis. 
They are thus of a more general nature than would be a parametric test which would be forced to choose 
one family of copulas with the problem of excluding others. The parametric test would then be exposed to 
the criticism that the rejection of a given choice might not be of a general nature. 

In the sequel, we will choose the level of 95% as the level of rejection, which leads us to neglect one 
extreme event out of twenty. This is not unreasonable in view of the other significant sources of errors 
resulting in particular from the empirical determination of the marginals and from the presence of outliers 
for instance. 

4 Empirical results 

We investigate the following assets : 

• foreign exchange rates, 

• metals traded on the London Metal Exchange, 

• stocks traded on the New York Stock Exchange.

4.1 Currencies



The sample we have considered is made of the daily returns of the spot foreign exchange rates for 6 currencies:
the Swiss Franc (CHF), the German Mark (DEM), the Japanese Yen (JPY), the Malaysian Ringgit (MYR),
the Thai Baht (THA) and the British Pound (UKP). All the exchange rates are expressed against the US
dollar. The time interval runs over ten years, from January 25, 1989 to December 31, 1998, so that each
sample contains 2500 data points.

We apply our test procedure to the entire sample and to two sub-samples of 1250 data points, so that
the first one covers the time interval from January 25, 1989 to January 11, 1994 and the second one from
January 12, 1994 to December 31, 1998. The results are presented in tables 2 to 4 and depicted in figures
7 to 9.

Tables 2 to 4 give, for the total time interval and for each of the two sub-intervals, the probability $p(d)$
to obtain from the Gaussian hypothesis a deviation between the distribution of the $z^2$ and the
$\chi^2$-distribution with two degrees of freedom larger than the observed one for each of the 15 pairs of
currencies according to the distances $d_1$-$d_4$ defined by (33)-(36).

Figures 7 to 9 organize the information shown in the tables by representing, for each distance $d_1$
to $d_4$, the number of currency pairs that give a test-value $p$ within a bin interval of width 0.05. A
clustering close to the origin signals a significant rejection of the Gaussian copula hypothesis.

At the 95% significance level, table 2 and figure 7 show that between 40% (according to $d_1$ and $d_3$) and
60% (according to $d_2$ and $d_4$) of the tested pairs of currencies are compatible with the Gaussian copula
hypothesis over the entire time interval. During the first half-period from January 25, 1989 to January 11,
1994 (table 3 and figure 8), 47% (according to $d_3$) and up to about 75% (according to $d_2$ and $d_4$) of the
tested currency pairs are compatible with the assumption of a Gaussian copula, while during the second
sub-period from January 12, 1994 to December 31, 1998 (table 4 and figure 9), between 66% (according to $d_1$)
and about 75% (according to $d_2$, $d_3$ and $d_4$) of the currency pairs remain compatible with the Gaussian
copula hypothesis. These results call for several comments from both a statistical and an economic point of
view.

We first note that the most significant rejection of the Gaussian copula hypothesis is obtained for the
distance $d_3$, which is indeed the most sensitive to the events in the tail of the distributions. The test
statistics given by this distance can indeed be very sensitive to the presence of a single large event in the
sample, so much so that the Gaussian copula hypothesis can be rejected only because of the presence of this
single event (outlier). The difference between the results given by $d_3$ and $d_4$ (the averaged $d_3$) is
very significant in this respect. Consider for instance the case of the German Mark and the Swiss Franc. During
the time interval from January 12, 1994 to December 31, 1998, we check in table 4 that the non-rejection
probability $p(d)$ is very significant according to $d_1$, $d_2$ and $d_4$ ($p(d) > 31\%$) while it is very low
according to $d_3$: $p(d) = 0.05\%$, which should lead to the rejection of the Gaussian copula hypothesis. This
suggests the presence of an outlier in the sample.

To check this hypothesis, we show in the upper panel of figure 10 the function

$$f_3(t) = \frac{\left| F_{z^2}(z^2(t)) - F_{\chi^2}(z^2(t)) \right|}{\sqrt{F_{\chi^2}(z^2(t)) \left[ 1 - F_{\chi^2}(z^2(t)) \right]}}, \qquad (37)$$

used in the definition of the Anderson-Darling distance $d_3 = \max_z f_3(z)$ (see definition (35)), expressed in
terms of time $t$ rather than $z^2$. The function has been computed over the two time sub-intervals separately.
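A time-indexed version of this quantity can be sketched as below. The code assumes the usual Anderson-Darling weighting, i.e., the gap between the empirical cumulative distribution of $z^2$ and the $\chi^2$ cumulative distribution with two degrees of freedom, divided by $\sqrt{F_{\chi^2}(1 - F_{\chi^2})}$; the function names are hypothetical and the input values are assumed strictly positive and moderate (below roughly 70, so that the weight stays finite in floating point).

```python
import math

def anderson_darling_profile(z2):
    """Return the weighted CDF gap at each time index, in the original time order."""
    n = len(z2)
    f3 = [0.0] * n
    for rank, k in enumerate(sorted(range(n), key=lambda i: z2[i]), start=1):
        f_chi2 = 1.0 - math.exp(-z2[k] / 2.0)
        # the empirical CDF jumps at each observation, so check both sides of the jump
        gap = max(abs(rank / n - f_chi2), abs((rank - 1) / n - f_chi2))
        f3[k] = gap / math.sqrt(f_chi2 * (1.0 - f_chi2))
    return f3

def largest_peak(z2):
    """Return the time index and height of the largest peak, i.e., the distance itself."""
    f3 = anderson_darling_profile(z2)
    k = max(range(len(f3)), key=f3.__getitem__)
    return k, f3[k]
```

Because the weight diverges in the tails, a single very large $z^2$ can dominate the maximum, which is how an isolated event on one day can drive the whole distance.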

Apart from three extreme peaks occurring on June 20, 1989, August 19, 1991 and September 16, 1992
during the first time sub-interval and one extreme peak on September 10, 1997 during the second time
sub-interval, the statistical fluctuations measured by $f_3(t)$ remain small and of the same order. Excluding
the contribution of these outlier events to $d_3$, the new statistical significance derived according to $d_3$
becomes similar to that obtained with $d_1$, $d_2$ and $d_4$ on each sub-interval. From the upper panel of
figure 10, it is clear that the Anderson-Darling distance $d_3$ is equal to the height of the largest peak,
corresponding to the event on August 19, 1991 for the first period and to the event on September 10, 1997 for
the second period. These events are depicted by a circled dot in the two lower panels of figure 10, which
represent the return of the German Mark versus the return of the Swiss Franc over the two considered time
periods.

Footnote (currency data): The data come from the historical database of the Federal Reserve Board.

The event on August 19, 1991 is associated with the coup against Gorbachev in Moscow: the German 
mark (respectively the Swiss franc) lost 3.37% (respectively 0.74%) in daily annualized value against the 
US dollar. The 3.37% drop of the German Mark is the largest daily move of this currency against the US 
dollar over the whole first period. On September 10, 1997, the German Mark appreciated by 0.60% against 
the US dollar while the Swiss Franc lost 0.79% which represents a moderate move for each currency, but a 
large joint move. This event is related to the contradictory announcements of the Swiss National Bank about 
the monetary policy, which put an end to a rally of the Swiss Franc along with the German mark against the 
US dollar. 

Thus, neglecting the large moves associated with major historical events or events associated with
unexpected incoming information, which cannot be taken into account by a statistical study, we obtain, for
$d_3$, significance levels compatible with those obtained with the other distances. We can thus conclude that,
according to the four distances, during the time interval from January 12, 1994 to December 31, 1998 the
Gaussian copula hypothesis cannot be rejected for the couple German Mark / Swiss Franc.

However, the non-rejection of the Gaussian copula hypothesis does not always have minor consequences
and may even lead to serious problems in stress scenarios. As shown in section 3.3, the non-rejection of
the Gaussian copula hypothesis does not exclude, at the 95% significance level, that the dependence of the
currency pairs may be accounted for by a Student's copula with adequate values of $\nu$ and $\rho$. Still
considering the pair German Mark / Swiss Franc, we see in table 1 that, according to $d_1$, $d_2$ and $d_4$, a
Student's copula with about five degrees of freedom allows one to reach the test values given in table 4. But,
with the correlation coefficient $\rho = 0.92$ for the German Mark / Swiss Franc couple, the Gaussian copula
assumption could lead us to neglect a tail dependence coefficient $\lambda_5(0.92) = 63\%$ according to the
Student's copula prediction. Such a large value of $\lambda_5(0.92)$ means that when an extreme event occurs
for the German Mark it also occurs for the Swiss Franc with a probability equal to 0.63. Therefore, a stress
scenario based on a Gaussian copula assumption would fail to account for such coupled extreme events, which
may represent as many as two thirds of all the extreme events, if it turned out that the true copula was the
Student's copula with five degrees of freedom. In fact, with such a value of the correlation coefficient, the
tail dependence remains high even if the number of degrees of freedom reaches twenty or more (see figure 1).

The case of the Swiss Franc and the Malaysian Ringgit offers a striking difference. For instance, in the
second half-period, the test statistics $p(d)$ are greater than 70% and even reach 91% while the correlation
coefficient is only $\rho = 0.16$, so that a Student's copula with 7-10 degrees of freedom can be mistaken for
the Gaussian copula (see table 1). Even in the most pessimistic situation $\nu = 7$, the choice of the Gaussian
copula amounts to neglecting a tail dependence coefficient $\lambda_7(0.16) = 4\%$ predicted by the Student's
copula. In this case, stress scenarios based on the Gaussian copula would predict uncoupled extreme events,
which would be shown wrong only once out of twenty-five times.

These two examples show that, more than the number of degrees of freedom of the Student's copula 
necessary to describe the data, the key parameter is the correlation coefficient. 
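The two tail dependence values quoted for these currency pairs can be recovered by hand, assuming the standard closed form of the tail dependence coefficient of the Student's copula (taken here to be the expression of section 2.4), in which $\bar t_{\nu+1}$ denotes the complementary cumulative distribution of a Student's variable with $\nu + 1$ degrees of freedom:

```latex
\lambda_\nu(\rho) = 2\,\bar t_{\nu+1}\!\left(\sqrt{\nu+1}\,\sqrt{\frac{1-\rho}{1+\rho}}\right)

% German Mark / Swiss Franc: \nu = 5, \rho = 0.92
\sqrt{6}\,\sqrt{0.08/1.92} = 0.5, \qquad \lambda_5(0.92) = 2\,\bar t_6(0.5) \approx 0.63

% Swiss Franc / Malaysian Ringgit: \nu = 7, \rho = 0.16
\sqrt{8}\,\sqrt{0.84/1.16} \approx 2.41, \qquad \lambda_7(0.16) = 2\,\bar t_8(2.41) \approx 0.04
```

The argument of $\bar t_{\nu+1}$ shrinks rapidly as $\rho$ approaches 1, which is why the correlation coefficient matters more than the number of degrees of freedom in these two examples.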

From an economic point of view, the impact of regulatory mechanisms between currencies or of monetary
crises can be well identified by the rejection or absence of rejection of our null hypothesis. Indeed, consider
the couple German Mark / British Pound. During the first half-period, their correlation coefficient is very
high ($\rho = 0.82$) and the Gaussian copula hypothesis is strongly rejected according to the four distances.
On the contrary, during the second half-period, the correlation coefficient significantly decreases
($\rho = 0.56$) and none of the four distances allows us to reject our null hypothesis. Such a non-stationarity
can be easily explained. Indeed, in October 1990, the British Pound entered the exchange rate mechanism of the
European Monetary System (EMS), so that the exchange rate between the German Mark and the British Pound was
not allowed to fluctuate beyond a margin of 6%. However, due to a strong speculative attack, the British Pound
was devalued in September 1992 and had to leave the EMS. Thus, between October 1990 and September 1992, the
exchange rate of the German Mark and the British Pound was confined within a narrow spread, incompatible with
the Gaussian copula description. After 1992, the British Pound exchange rate floated with respect to the German
Mark and the dependence between the two currencies decreased, as shown by their correlation coefficient. In
this regime, we can no longer reject the Gaussian copula hypothesis.

The impact of major crises on the copula can also be clearly identified. Such a case is exhibited by the
couple Malaysian Ringgit / Thai Baht. Indeed, during the period from January 1989 to January 1994, these
two currencies have only undergone moderate and weakly correlated ($\rho = 0.29$) fluctuations, so that our
null hypothesis cannot be rejected at the 95% significance level. On the contrary, during the period from
January 1994 to October 1998, the Gaussian copula hypothesis is strongly rejected. This rejection is obviously
due to the persistent and dependent ($\rho = 0.44$) shocks incurred by the Asian financial and monetary markets
during the seven months of the Asian crisis from July 1997 to January 1998 [Baig and Goldfajn (1998),
Kaminsky and Schmukler (1999)].

These two cases show that the Gaussian copula hypothesis can be considered reasonable for currencies
in the absence of regulatory mechanisms and of strong and persistent crises. They also allow us to understand
why the results of the test over the entire sample are so much weaker than the results obtained for the two
sub-intervals: the time series are strongly non-stationary.



4.2 Commodities: metals 

We consider a set of 6 metals traded on the London Metal Exchange: aluminium, copper, lead, nickel, tin
and zinc. Each sample contains 2270 data points and covers the time interval from January 4, 1989 to
December 30, 1997. The results are summarized in table 5 and in figure 11.



Table 5 gives, for each of the 15 pairs of commodities, the probability $p(d)$ to obtain from the Gaussian
hypothesis a deviation between the distribution of the $z^2$ and the $\chi^2$-distribution with two degrees of
freedom larger than the observed one for the commodity pair according to the distances $d_1$-$d_4$ defined by
(33)-(36).

Figure 11 organizes the information shown in table 5 by representing, for each distance, the number
of commodity pairs that give a test-value $p$ within a bin interval of width 0.05. A clustering close to the
origin signals a significant rejection of the Gaussian copula hypothesis.

According to the three distances $d_1$, $d_2$ and $d_4$, at least two thirds and up to 93% of the set of 15
pairs of commodities are inconsistent with the Gaussian copula hypothesis. Surprisingly, according to the
distance $d_3$, at the 95% significance level, two thirds of the set of 15 pairs of commodities remain
compatible with the Gaussian copula hypothesis. This is the reverse of the situation found previously for
currencies. Overall, these test values lead us to reject the Gaussian copula hypothesis.

Moreover, the largest value obtained for the distance $d_3$ is $p = 65\%$ for the pair copper-tin, which is
significantly smaller than the 80% or 90% reached for some currencies over a similar time interval. Thus,
even in the few cases where the Gaussian copula assumption is not rejected, the test values obtained are not
really sufficient to distinguish between the Gaussian copula and a Student's copula with $\nu$ = 5-6 degrees
of freedom. In such a case, with correlation coefficients ranging between 0.31 and 0.46, the tail dependence
neglected by keeping the Gaussian copula is no less than 10% and can reach 15%. One extreme event out of
seven to ten might occur simultaneously on both marginals, which would be missed by the Gaussian copula.

To summarize, the Gaussian copula does not seem a reasonable assumption for metals, and it has not
appeared necessary to test these data over smaller time intervals.



4.3 Stocks 

We now study the daily return distributions for 22 stocks among the largest companies quoted on the New
York Stock Exchange: Appl. Materials (AMAT), AT&T (T), Citigroup (C), Coca Cola (KO), EMC,
Exxon-Mobil (XOM), Ford (F), General Electric (GE), General Motors (GM), Hewlett Packard (HWP), IBM, Intel
(INTC), MCI WorldCom (WCOM), Medtronic (MDT), Merck (MRK), Microsoft (MSFT), Pfizer (PFE),
Procter&Gamble (PG), SBC Communication (SBC), Sun Microsystem (SUNW), Texas Instruments (TXN),
Wal Mart (WMT).

Each sample contains 2500 data points and covers the time interval from February 8, 1991 to December
29, 2000. It has been divided into two sub-samples of 1250 data points, so that the first one covers the time
interval from February 8, 1991 to January 18, 1996 and the second one from January 19, 1996 to December
20, 2000. The results for fifteen randomly chosen pairs of assets are presented in tables 6 to 8 while the
results obtained for the entire set are represented in figures 12 to 14.



At the 95% significance level, figure 12 shows that 75% of the pairs of stocks are compatible with the
Gaussian copula hypothesis. Figure 13 shows that over the time interval from February 1991 to January
1996, this percentage becomes larger than 99% for $d_1$, $d_2$ and $d_4$ while it equals 94% according to
$d_3$. It is striking to note that, during this period, according to $d_1$, $d_2$ and $d_4$, more than a quarter
of the pairs of stocks obtain a test-value $p$ larger than 90%, so that we can assert that they are completely
inconsistent with the Student's copula hypothesis for Student's copulas with less than 10 degrees of freedom.
Among this set, not a single pair has a correlation coefficient larger than 0.4, so that a scenario based on
the Gaussian copula hypothesis leads to neglecting a tail dependence of less than 5% as would be predicted by
the Student's copula with 10 degrees of freedom. In addition, about 80% of the pairs of stocks lead to a
test-value $p$ larger than 50% according to the distances $d_1$, $d_2$ and $d_4$, so that as many as 80% of the
pairs of stocks are incompatible with a Student's copula with a number of degrees of freedom less than or equal
to 5. Thus, for correlation coefficients smaller than 0.3, the Gaussian copula hypothesis leads to neglecting a
tail dependence of less than 10%. For correlation coefficients smaller than 0.1, which corresponds to 13% of the
total number of pairs, the Gaussian copula hypothesis leads to neglecting a tail dependence of less than 5%.



Figure 14 shows that, over the time interval from January 1996 to December 2000, 92% of the pairs of
stocks are compatible with the Gaussian copula hypothesis according to $d_1$, $d_2$ and $d_4$ and more than 79%
according to $d_3$. About a quarter of the pairs of stocks have a test-value $p$ larger than 50% according to
the four measures and thus are inconsistent with a Student's copula with less than five degrees of freedom.

For completeness, we present in table 9 the results of the tests performed for five stocks belonging to
the computer area: Hewlett Packard, IBM, Intel, Microsoft and Sun Microsystem. We observe that, during
the first half-period, all the pairs of stocks are compatible with the Gaussian copula hypothesis at the 95%
significance level. The results are rather different for the second half-period since about 40% of the pairs of
stocks reject the Gaussian copula hypothesis according to $d_1$, $d_2$ and $d_3$. This is probably due to the
existence of a few shocks, notably associated with the crash of the "new economy" in March-April 2000.

On the whole, it appears however that there is no systematic rejection of the Gaussian copula hypothesis
for stocks within the same industrial area, notwithstanding the fact that one can expect stronger correlations
between such stocks than for currencies for instance.

Footnote (stock data): The data come from the Center for Research in Security Prices (CRSP) database.



5 Conclusion 

We have studied the null hypothesis that the dependence between pairs of financial assets can be modeled
by the Gaussian copula.

Our test procedure is based on the following simple idea. Assuming that the copula of two assets X
and Y is Gaussian, then the multivariate distribution of (X, Y) can be mapped into a Gaussian multivariate
distribution, by a transformation of each marginal into a normal distribution, which leaves the copula of
X and Y unchanged. Testing the Gaussian copula hypothesis is therefore equivalent to the more standard
problem of testing a two-dimensional multivariate Gaussian distribution. We have used a bootstrap method
to determine and calibrate the test statistics. Four different measures of distances between distributions,
more or less sensitive to the departure in the bulk or in the tail of distributions, have been proposed to
quantify the probability of rejection of our null hypothesis.

Our tests have been performed over three types of assets: currencies, commodities (metals) and stocks.
In most cases, for currencies and stocks, the Gaussian copula hypothesis cannot be rejected at the 95%
confidence level. For currencies, according to three of the four distances at least,

• 40% of the pairs of currencies, over a 10-year time interval (due to non-stationary data),

• 67% of the pairs of currencies, over the first 5-year time interval,

• 73% of the pairs of currencies, over the second 5-year time interval,

are compatible with the Gaussian copula hypothesis. For stocks, we have shown that

• 75% of the pairs of stocks, over a 10-year time interval,

• 93% of the pairs of stocks, over the first 5-year time interval,

• 92% of the pairs of stocks, over the second 5-year time interval,

are compatible with the Gaussian copula hypothesis. In contrast, the Gaussian copula hypothesis cannot be
considered as reasonable for metals: between 66% and 93% of the pairs of metals reject the null hypothesis
at the 95% confidence level.

Notwithstanding the apparent qualification of the Gaussian copula hypothesis for most of the currencies
and the stocks we have analyzed, we must bear in mind the fact that a non-Gaussian copula cannot be
rejected. In particular, we have shown that a Student's copula can always be mistaken for a Gaussian copula
if its number of degrees of freedom is sufficiently large. Then, depending on the correlation coefficient, the
Student's copula can predict a non-negligible tail dependence which is completely missed by the Gaussian
copula assumption. In other words, the Gaussian copula predicts no tail dependence and therefore does
not account for extreme events that may occur simultaneously but nevertheless too rarely to modify the test
statistics. To quantify the probability of neglecting such events, we have investigated the situations when
one is unable to distinguish between the Gaussian and Student's copulas for a given number of degrees of
freedom. Our study leads to the conclusion that it may be very dangerous to embrace blindly the Gaussian
copula hypothesis when the correlation coefficient between the pair of assets is too high, as the tail
dependence neglected by the Gaussian copula can be as large as 0.6. In this respect, the case of the Swiss
Franc and the German Mark is striking. The test values $p$ obtained are very significant (about 33%), so that
we cannot mistake the Gaussian copula for a Student's copula with less than 5-7 degrees of freedom. However,
their correlation coefficient is so high ($\rho = 0.9$) that a Student's copula with, say, $\nu = 30$ degrees of
freedom still has a large tail dependence.

This remark shows that it is highly desirable to develop tests that are specific to the detection of a possible 
tail dependence between two time series. This task is very difficult but we hope to report useful progress in 
the near future. Another approach is to test for other non-Gaussian copulas, such as the Student's copula. 
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Figure 2: Maximum value of the correlation coefficient p as a function of v, below which the tail de- 
pendence \u{p) of a Student' copula is smaller than a given small value, here taken equal to Xu{p) = 
1%, 2.5%, 5% and 10%. The choice Xy{p) = 5% for instance corresponds to 1 event in 20 for which the 
pair of variables are asymptotically coupled. At the 1 — \i,{p) probability level, values of A < \v{p) are 
undistinguishable from 0, which means that the Student's copula can be approximated by a Gaussian copula. 
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Figure 3: Cumulative "distribution of probabilities" D{p) = D{p{d)) obtained as the fraction of Student's 
pairs with i/ = 4 degrees of freedom that exhibit the value p for the probability that Gaussian vectors 
can have a similar or larger distance. See the text for a detailled description of how D{p) is defined and 
constructed. Each panel corresponds to one of the four distances dj, i G {1,2,3,4}, defined in the text 
by equations (33-3^). In each panel, we construct the cumulative "distribution of probabilities" D{p) for 5 
different values of the correlation coefficient p = 0.1, 0.3, 0.5, 0.7 and 0.9 of the Student's copula. 
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Figure 4: Same as figure ^ for Student's distributions with u = 20 degrees of freedom. 
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0.84 


0.86 
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0.91 
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0.9 






di 
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0.99 






da 
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0.98 
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0.99 


0.99 
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0.99 
Table 1: The values $p_{95\%}(\nu, \rho)$ shown in this table give the maximum values that the probability $p$ should take in order to be able to reject the hypothesis that a Student's copula with $\nu$ degrees of freedom and correlation $\rho$ is indistinguishable from a Gaussian copula at the 95% confidence level. $p_{95\%}$ is the abscissa corresponding to the ordinate $D(p_{95\%}) = 0.95$ shown in figures 3 and 4. $p$ is the probability that pairs of Gaussian random variables with the correlation coefficient $\rho$ have a distance (between the distribution of $z^2$ and the theoretical $\chi^2$ distribution) equal to or larger than the corresponding distance obtained for the Student's vector time series. A small $p$ corresponds to a clear distinction between Student's and Gaussian vectors, as it is improbable that Gaussian vectors exhibit a distance larger than that found for the Student's vectors. Different values of the number $\nu$ of degrees of freedom, ranging from $\nu = 3$ to $\nu = 50$, and of the correlation coefficient, $\rho$ = 0.1 to 0.9, are shown. Take for instance the example with $\nu = 4$ and $\rho = 0.3$. The table indicates that $p$ should be less than about 0.3 (resp. 0.2) according to the distances $d_1$ and $d_3$ (resp. $d_2$ and $d_4$) in order to distinguish this Student's copula from the Gaussian copula at the 95% confidence level. This means that less than 20-30% of Gaussian vectors should have a distance for their $z^2$ larger than the one found for the Student's vectors. See text for further explanations.
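
The probability $p$ described in this caption can be estimated by simulation. The sketch below is a simplified illustration under stated assumptions, not the procedure of the text: marginals are mapped to Gaussian scores through their ranks, $z^2$ is computed from the sample covariance of these scores, and a Kolmogorov-type distance is used as a stand-in for $d_1$ (the exact distances are those of equations (33)-(36)). All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def normal_scores(values):
    """Map a sample to Gaussian marginals through its ranks (empirical CDF)."""
    n = len(values)
    order = sorted(range(n), key=values.__getitem__)
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = _STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def kolmogorov_distance(xs, ys):
    """Return (distance between the law of z^2 and chi-square(2), estimated rho)."""
    gx, gy = normal_scores(xs), normal_scores(ys)
    n = len(gx)
    sxx = sum(a * a for a in gx) / n
    syy = sum(b * b for b in gy) / n
    sxy = sum(a * b for a, b in zip(gx, gy)) / n
    det = sxx * syy - sxy * sxy
    # z^2 is the quadratic form of the scores with the inverse covariance matrix.
    z2 = sorted((syy * a * a - 2 * sxy * a * b + sxx * b * b) / det
                for a, b in zip(gx, gy))
    dist = 0.0
    for k, z in enumerate(z2):
        f_chi2 = 1.0 - math.exp(-z / 2.0)  # chi-square CDF, 2 degrees of freedom
        dist = max(dist, abs((k + 1) / n - f_chi2), abs(k / n - f_chi2))
    return dist, sxy / math.sqrt(sxx * syy)


def gaussian_pairs(n, rho, rng):
    """Draw n pairs of standard Gaussian variables with correlation rho."""
    c = math.sqrt(1.0 - rho * rho)
    xs, ys = [], []
    for _ in range(n):
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(a)
        ys.append(rho * a + c * b)
    return xs, ys


def student_pairs(n, rho, nu, rng):
    """Draw n pairs from a bivariate Student's distribution with nu degrees of freedom."""
    xs, ys = gaussian_pairs(n, rho, rng)
    out_x, out_y = [], []
    for a, b in zip(xs, ys):
        s = math.sqrt(rng.gammavariate(nu / 2.0, 2.0) / nu)  # sqrt(chi2_nu / nu)
        out_x.append(a / s)
        out_y.append(b / s)
    return out_x, out_y


def p_value(xs, ys, n_sim=200, seed=0):
    """Fraction of synthetic Gaussian samples whose distance is >= the observed one."""
    rng = random.Random(seed)
    d_obs, rho_hat = kolmogorov_distance(xs, ys)
    exceed = 0
    for _ in range(n_sim):
        d_sim, _ = kolmogorov_distance(*gaussian_pairs(len(xs), rho_hat, rng))
        exceed += d_sim >= d_obs
    return exceed / n_sim


if __name__ == "__main__":
    rng = random.Random(42)
    xs, ys = student_pairs(1000, 0.3, 4, rng)
    print("estimated p:", p_value(xs, ys))
```

Repeating the last step over many independent Student's samples and collecting the resulting values of $p$ would give an empirical counterpart of $D(p)$, from which a quantile such as $p_{95\%}$ can be read off.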
Figure 5: Graph of the minimum significance level $(1 - p_{95\%})$ necessary to distinguish the Gaussian copula hypothesis $H_0$ from the hypothesis of a Student's copula with $\nu$ degrees of freedom, as a function of $1/\nu$, for a given distance $d_1$ and various correlation coefficients $\rho$ = 0.1, 0.3, 0.5, 0.7 and 0.9.
[Table 2 body: only fragments survive, listed in reading order; "…" marks lost entries.]
… 1.80e-03  6.00e-04  1.30e-03 …
… 2.26e-02  1.33e-01  1.00e-01  1.51e-01
DEM …
… 4.25e-01  6.77e-01  6.22e-01 …
… 7.94e-01  6.23e-01 …
… UKP  0.15  5.22e-01  6.23e-01  3.21e-02  7.05e-01



Table 2: Each row gives the statistics of our test for each of the 15 pairs of currencies over a 10-year time interval from January 25, 1989 to December 31, 1998. The column $\rho$ gives the empirical correlation coefficient for each pair, determined as described in the text and defined in (31). The columns $d_1$, $d_2$, $d_3$ and $d_4$ give the probability to obtain, under the Gaussian hypothesis, a deviation between the distribution of $z^2$ and the $\chi^2$-distribution with two degrees of freedom larger than the observed one for the currency pair, according to the distances $d_1$-$d_4$ defined by (33)-(36).
[Table 3 body: only fragments survive, listed in reading order; "…" marks lost entries.]
… 1.12e-01 …
… 3.54e-01  3.45e-01 …
… 4.34e-01  8.62e-01  3.13e-02  8.67e-01



Table 3: Same as table 2 for currencies over a 5-year time interval from January 25, 1989 to January 11, 1994.
[Table 4 body: only fragments survive, listed in reading order; "…" marks lost entries.]
… 0.92  3.15e-01  3.11e-01 …
… 3.41e-01 …
… 1.10e-02  3.87e-02  1.05e-01  3.34e-02
CHF  UKP …
… 7.63e-02  2.14e-01 …
… 4.62e-01 …
… 5.00e-04  1.20e-03  5.34e-02  1.20e-03
… UKP  0.11  5.94e-01  7.44e-01  6.95e-01  7.82e-01
THA  UKP  0.12  1.26e-02  7.66e-02  1.19e-01  6.51e-02



Table 4: Same as table 2 for currencies over a 5-year time interval from January 12, 1994 to December 31, 1998.






Figure 7: For each distance $d_1$-$d_4$ defined in equations (33)-(36), this figure shows the number of currency pairs that give a given $p$ (shown on the abscissa) within a bin interval of width 0.05, for different currencies over a 10-year time interval from January 25, 1989 to December 31, 1998. $p$ is the probability that pairs of Gaussian random variables with the same correlation coefficient $\rho$ have a distance (between the distribution of $z^2$ and the theoretical $\chi^2$ distribution) equal to or larger than the corresponding distance obtained for each currency pair. A clustering close to the origin signals a significant rejection of the Gaussian copula hypothesis.
Figure 8: Same as figure 7 for currencies over a 5-year time interval from January 25, 1989 to January 11, 1994.
Figure 9: Same as figure 7 for currencies over a 5-year time interval from January 12, 1994 to December 31, 1998.
Table 5: Same as table 2 for metals over a 9-year time interval from January 4, 1989 to December 30, 1997.
Figure 10: The upper panel represents the graph of the function $f_3(t)$, defined in the text and used in the definition of the distance $d_3$, for the pair Swiss Franc/German Mark as a function of time $t$, over the time intervals from January 25, 1989 to January 11, 1994 and from January 12, 1994 to December 31, 1998. The two lower panels represent the scatter plots of the return of the German Mark versus the return of the Swiss Franc during these two time periods. The circled dot, in each panel, shows the pair of returns responsible for the largest deviation of $f_3$ during the considered time interval.
Figure 11: Same as figure 7 for metals over a 9-year time interval from January 4, 1989 to December 30, 1997.
                 $\rho$   $d_1$      $d_2$      $d_3$      $d_4$
amat   pfe       0.15     7.41e-02   1.12e-01   8.40e-03   1.14e-01
…      sunw      0.28     …          …          …          …
…      …         0.28     4.79e-01   3.77e-01   1.52e-01   3.75e-01
…      …         0.20     3.20e-03   0.00e+00   6.02e-02   0.00e+00
[Unplaced fragments from rows lost between the second and third rows above, in reading order: 4.87e-01; 0.33; 5.44e-02, 9.07e-02.]



Table 6: Same as table 2 for stocks over a 10-year time interval from February 8, 1991 to December 29, 2000.
                 $\rho$   $d_1$      $d_2$      $d_3$      $d_4$
…      …         …        …          …          …          8.59e-01
intc   mrk       0.13     8.59e-01   8.21e-01   5.48e-02   8.65e-01
ko     sunw      0.20     3.53e-01   5.98e-01   4.51e-01   6.79e-01
mdt    t         0.14     9.09e-01   8.98e-01   1.68e-01   9.15e-01
mrk    xom       0.12     5.36e-01   6.21e-01   1.20e-01   6.18e-01
msft   sunw      0.40     2.68e-01   1.38e-01   1.60e-01   1.39e-01
pfe    wmt       0.23     2.94e-01   4.66e-01   1.41e-01   5.23e-01
t      wcom      0.19     7.92e-01   9.36e-01   4.95e-02   9.49e-01
txn    wcom      0.23     9.10e-01   9.83e-01   1.00e-01   9.93e-01
wmt    xom       0.22     7.16e-01   6.71e-01   7.35e-02   6.89e-01



Table 7: Same as table 2 for stocks over a 5-year time interval from February 8, 1991 to January 18, 1996.
                 $\rho$   $d_1$      $d_2$      $d_3$      $d_4$
amat   pfe       0.19     2.96e-01   3.39e-01   3.10e-02   3.95e-01
c      sunw      0.31     7.12e-01   6.58e-01   9.47e-01   7.08e-01
f      ge        0.34     3.80e-01   2.36e-01   3.22e-01   2.18e-01
gm     ibm       0.21     3.05e-02   1.79e-01   2.37e-01   2.19e-01
hwp    sbc       0.11     3.47e-01   6.13e-01   7.17e-01   6.40e-01
intc   mrk       0.20     1.31e-01   2.06e-01   5.57e-01   2.05e-01
ko     sunw      0.10     6.89e-01   3.44e-01   8.59e-01   3.52e-01
mdt    t         0.19     4.28e-01   6.11e-01   5.01e-01   5.79e-01
mrk    xom       0.23     3.57e-01   6.64e-01   1.13e-01   7.38e-01
msft   sunw      0.46     5.79e-02   7.60e-02   8.00e-04   8.07e-02
pfe    wmt       0.30     2.31e-01   2.12e-01   5.59e-01   1.98e-01
t      wcom      0.33     1.20e-01   1.37e-01   1.73e-01   1.40e-01
txn    wcom      0.31     5.63e-01   4.06e-01   4.64e-01   4.17e-01
wmt    xom       0.19     1.61e-01   5.38e-02   3.78e-02   4.94e-02



Table 8: Same as table 2 for stocks over a 5-year time interval from January 19, 1996 to December 29, 2000.
Figure 12: Same as figure 7 for stocks over a 10-year time interval from February 8, 1991 to December 29, 2000.
Figure 13: Same as figure 7 for stocks over a 5-year time interval from February 8, 1991 to January 18, 1996.
Figure 14: Same as figure 7 for stocks over a 5-year time interval from January 19, 1996 to December 29, 2000.
Time interval from February 8, 1991 to January 18, 1996

                 $\rho$   $d_1$      $d_2$      $d_3$      $d_4$
hwp    ibm       0.34     3.36e-01   2.26e-01   3.33e-01   2.35e-01
hwp    intc      0.46     3.01e-01   4.73e-01   5.12e-01   5.21e-01
hwp    msft      0.41     7.63e-01   4.72e-01   3.23e-01   4.53e-01
hwp    sunw      …        …          …          …          …
ibm    intc      0.30     4.81e-01   3.54e-01   4.18e-02   3.34e-01
ibm    msft      0.24     3.93e-01   6.61e-01   5.88e-01   7.07e-01
ibm    sunw      0.29     9.65e-01   9.71e-01   3.46e-01   9.86e-01
intc   msft      0.47     2.59e-01   1.45e-01   4.50e-02   1.53e-01
intc   sunw      0.40     4.81e-01   3.86e-01   4.47e-02   3.95e-01
msft   sunw      0.40     2.68e-01   1.38e-01   1.66e-01   1.39e-01

Time interval from January 19, 1996 to December 29, 2000

                 $\rho$   $d_1$      $d_2$      $d_3$      $d_4$
hwp    ibm       0.46     2.02e-02   3.21e-02   9.60e-03   3.96e-02
hwp    intc      0.44     2.88e-02   4.89e-02   6.00e-04   5.80e-02
hwp    msft      0.37     5.23e-02   9.88e-02   3.36e-01   1.18e-01
hwp    sunw      0.45     5.66e-01   5.65e-01   1.08e-01   6.23e-01
ibm    intc      0.43     5.34e-02   3.31e-02   1.68e-02   2.44e-02
ibm    msft      0.39     1.00e-02   9.50e-03   2.28e-02   8.80e-03
ibm    sunw      0.46     2.35e-01   1.56e-01   3.38e-01   1.49e-01
intc   msft      0.57     3.18e-01   1.61e-01   1.15e-01   1.71e-01
intc   sunw      0.50     6.68e-02   3.55e-02   1.00e-04   4.37e-02
msft   sunw      0.46     5.79e-02   7.60e-02   8.00e-04   8.07e-02



Table 9: Same as table 2 for stocks belonging to the information technology sector, over two time intervals of 5 years.
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